Invariance of Weight Distributions in Rectified MLPs
Abstract
An interesting approach to analyzing and developing tools for neural networks that has received renewed attention is to examine the equivalent kernel of the neural network. This is based on the fact that a fully connected feedforward network with one hidden layer, a certain weight distribution, an activation function, and an infinite number of neurons is a mapping that can be viewed as a projection into a Hilbert space. We show that the equivalent kernel of an MLP with ReLU or Leaky ReLU activations for all rotationally-invariant weight distributions is the same, generalizing a previous result that required Gaussian weight distributions. We derive the equivalent kernel for these cases. In deep networks, the equivalent kernel approaches a pathological fixed point, which can be used to argue why training randomly initialized networks can be difficult. Our results also have implications for weight initialization and the level sets in neural network cost functions.
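The central claim can be checked numerically in a small case. The sketch below is not taken from the paper: it estimates the equivalent kernel E[relu(w·x) relu(w·y)] by Monte Carlo for Gaussian weights and for weights drawn uniformly from a sphere, and compares both with the well-known closed form for the Gaussian case (the degree-1 arc-cosine kernel). The function names, the test vectors, the sample size, and the choice of sphere radius sqrt(d) (picked so each coordinate has unit variance, matching the Gaussian case) are all assumptions made for illustration. The last function iterates the normalized kernel's angle map to illustrate how the angle between two inputs shrinks with depth, which is one way to read the "pathological fixed point" mentioned above.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def arccos_kernel(x, y):
    """Closed form of E[relu(w.x) relu(w.y)] for w ~ N(0, I)."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny / (2.0 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))


def empirical_kernel(W, x, y):
    """Monte Carlo estimate of the kernel; each row of W is one hidden unit."""
    return float(np.mean(relu(W @ x) * relu(W @ y)))


def iterate_angle(theta0, n_layers):
    """Angle between two inputs after each layer of the normalized ReLU kernel."""
    angles = [theta0]
    for _ in range(n_layers):
        t = angles[-1]
        c = (np.sin(t) + (np.pi - t) * np.cos(t)) / np.pi
        angles.append(float(np.arccos(np.clip(c, -1.0, 1.0))))
    return np.array(angles)


rng = np.random.default_rng(0)
d, m = 5, 200_000  # input dimension and number of sampled hidden units (assumed)

# Two fixed, hypothetical input vectors.
x = np.array([1.0, 2.0, 0.0, -1.0, 0.5])
y = np.array([0.5, 1.0, 1.0, 0.0, -0.5])

# Gaussian weights.
W_gauss = rng.normal(size=(m, d))

# Rotationally invariant but non-Gaussian weights: uniform on a sphere of
# radius sqrt(d), so that each coordinate has unit variance (an assumption).
U = rng.normal(size=(m, d))
W_sphere = np.sqrt(d) * U / np.linalg.norm(U, axis=1, keepdims=True)

k_exact = arccos_kernel(x, y)
k_gauss = empirical_kernel(W_gauss, x, y)
k_sphere = empirical_kernel(W_sphere, x, y)
print(f"closed form: {k_exact:.4f}  gaussian: {k_gauss:.4f}  sphere: {k_sphere:.4f}")

# Angle between x and y as seen through 50 stacked infinite-width ReLU layers.
theta0 = float(np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y))))
angles = iterate_angle(theta0, 50)
print(f"initial angle: {angles[0]:.4f}  after 50 layers: {angles[-1]:.4f}")
```

Both Monte Carlo estimates should agree with the closed form up to sampling noise. For this particular pair of distributions the agreement follows directly from ReLU being positively homogeneous, since a Gaussian vector is a uniformly distributed direction scaled by an independent radius; the paper's result covers rotationally invariant distributions more generally.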
Journal: CoRR
Volume: abs/1711.09090
Publication date: 2017